A tool for the identification of chemical entities (CheNER-BioC)

Authors

  • Anabel Usié
  • Joaquim Cruz
  • Jorge Comas
  • Francesc Solsona
  • Rui Alves
Abstract

The CHEMDNER task is a Named Entity Recognition (NER) challenge that aims to label different types of chemical names in biomedical text. We address this challenge with a hybrid approach that combines linear Conditional Random Fields (CRF) with regular expression taggers and dictionary lookup, followed by a post-processing step, to tag chemical names in a corpus of Medline abstracts. Our system achieves F-scores of 72.08% and 70.62% on the development and sample sets, respectively, for the CDI subtask. For the CEM subtask, performance increases to 72.61% and 73.68% on the development and sample sets, respectively.
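
The general shape of such a hybrid tagger can be illustrated with a minimal Python sketch. It is not the CheNER-BioC implementation: the dictionary contents, the formula pattern, and all function names (`regex_tagger`, `dictionary_tagger`, `merge_spans`, `tag_chemicals`) are hypothetical, and the CRF component is represented only by an optional callable that a trained model would have to supply.

```python
import re

# Hypothetical mini-dictionary; a real system would load a large chemical lexicon.
CHEMICAL_DICTIONARY = {"aspirin", "glucose", "sodium chloride"}

# Illustrative pattern for simple molecular formulas such as H2O or C6H12O6.
# It is deliberately naive and would also match all-caps acronyms.
FORMULA_PATTERN = re.compile(r"\b(?:[A-Z][a-z]?\d*){2,}\b")


def regex_tagger(text):
    """Return (start, end) character spans matched by the formula pattern."""
    return [m.span() for m in FORMULA_PATTERN.finditer(text)]


def dictionary_tagger(text):
    """Return (start, end) spans of case-insensitive dictionary hits."""
    spans = []
    for term in CHEMICAL_DICTIONARY:
        pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        spans.extend(m.span() for m in pattern.finditer(text))
    return spans


def merge_spans(spans):
    """Post-processing step: merge overlapping or touching spans."""
    merged = []
    for start, end in sorted(spans):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def tag_chemicals(text, crf_tagger=None):
    """Combine all taggers and return the tagged mentions in text order.

    crf_tagger is an assumed callable mapping text to a list of
    (start, end) spans; it stands in for a trained CRF model.
    """
    spans = regex_tagger(text) + dictionary_tagger(text)
    if crf_tagger is not None:
        spans += crf_tagger(text)
    return [text[start:end] for start, end in merge_spans(spans)]


if __name__ == "__main__":
    sentence = "Patients received aspirin and C6H12O6 dissolved in H2O."
    print(tag_chemicals(sentence))
```

Taking the union of spans is only one possible combination rule; a system could instead let one tagger override another or filter candidates in post-processing.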

Similar resources

CheNER: a tool for the identification of chemical entities and their classes in biomedical literature

BACKGROUND Small chemical molecules regulate biological processes at the molecular level. Those molecules are often involved in causing or treating pathological states. Automatically identifying such molecules in biomedical text is difficult due to both the diverse morphology of chemical names and the alternative types of nomenclature that are simultaneously used to describe them. To address t...

CheNER: chemical named entity recognizer

MOTIVATION Chemical named entity recognition is used to automatically identify mentions of chemical compounds in text and is the basis for more elaborate information extraction. However, only a small number of applications are freely available to identify such mentions. Particularly challenging and useful is the identification of International Union of Pure and Applied Chemistry (IUPAC) chemica...

A Visual Tool for Displaying Annotations in BioC

To support data interoperability for annotation results, we developed BioC-Viewer (http://viewer.bioqrator.org/). BioC-Viewer is a web-based interactive curation tool to visualize annotation results from text mining tools and to easily curate entities and relationships, with support for the BioC format. Since our focus was on a visual tool for BioGRID curators in BioCreative V BioC Track, BioC-Viewer...

BioQRator: a web-based interactive biomedical literature curating system

BioQRator (http://www.bioqrator.org) is a web-based annotation tool for biomedical literature. This tool was designed to support any task annotating entities and relationships. It is also one of the first web tools which support the BioC format (1) for annotation. For input, any documents in the BioC format and PubMed® abstracts can be used. For output, annotated documents can be saved in a Bi...

Overview of BioCreative V BioC Track

BioC is a simple XML format for text, annotations and relations, and was developed to achieve interoperability for biomedical text processing. Following the success of BioC in BioCreative IV, the BioCreative V BioC track addressed a collaborative task to build an assistant tool for BioGRID curation. For this track, we divided the whole task into 8 different subtopics including gene/protein/orga...

Publication date: 2013